Abstract: A linear time approximate maximum likelihood
decoding algorithm on tail-biting trellises is presented that
requires exactly two rounds on the trellis. This is an adaptation of
an algorithm proposed earlier, with the advantage that it reduces
the time complexity from O(m log m) to O(m), where m is the
number of nodes in the tail-biting trellis. A necessary condition
for the output of the algorithm to differ from the output of the
ideal ML decoder is deduced, and simulation results on an AWGN
channel using tail-biting trellises for two rate 1/2 convolutional
codes, with memory 4 and 6 respectively, are reported.



I. Introduction 

Maximum likelihood decoding on tail-biting trellises (TBT) 
has been extensively studied in the literature and several linear 
time approximate algorithms have been proposed (see, for
example, [7], [6], [10], [9], [3]). Some of these algorithms may
fail to converge on certain inputs. Algorithms with guaranteed
convergence were studied in [11], but they fail to achieve linear
complexity. In particular, although the approximate algorithm
proposed in [11] achieves performance close to an ideal ML
decoder, it has a worst case time complexity of O(m log m),
where m is the number of nodes in the TBT. The algorithm
exploits the fact that a linear tail-biting trellis can be viewed as 
a coset decomposition of the group corresponding to the linear 
code with respect to a specific subgroup and is an adaptation 
of the classical A* algorithm. The algorithm operates in two 
phases. The first phase does a Viterbi-like pass on the TBT to 
obtain certain estimates which are used in the second phase 
to guide the search for the shortest path corresponding to a 
codeword in the TBT. 

In this note, the complexity of the approximate algorithm 
in [11] is reduced to O(m). The reduction in complexity is
achieved by eliminating the use of a heap in the second phase 
of the original algorithm using the well known technique of 
dynamic programming. The estimates gathered during the first 
phase are used in the second phase for the computation of a 
metric for each node in the TBT using another simple Viterbi- 
like pass. It turns out that updates performed by the two 
algorithms are identical for the shortest path which must be 
output by the algorithm (although the metric values computed 
for other nodes may differ). 

We give an analysis of the algorithm here. Simulations are 
included for completeness and the two algorithms perform 
identically as expected. 



II. Background 



A linear tail-biting trellis for an $(n, k)$ linear block code $C$
over the field $F_q$ can be constructed as a trellis product [5] of the
individual trellises corresponding to the $k$ rows of the generator
matrix $G$ for $C$ [4]. The trellis product $T$ of a pair of trellises
$T_1$ and $T_2$ has at $TimeIndex(i)$ a set of vertices which is the
Cartesian product of the vertices of $T_1$ and $T_2$ at that time index,
with an edge between $TimeIndex(i)$ and $TimeIndex(i+1)$ from
$(v_1, v_2)$ to $(v_1', v_2')$ with label $(a + a')$ whenever $(v_1, v_1')$
and $(v_2, v_2')$ are edges between vertices at $TimeIndex(i)$ and
$TimeIndex(i+1)$ in $T_1$ and $T_2$ with labels $a$ and $a'$ respectively,
for some $a, a' \in F_q$, $0 \le i \le n-1$, where $+$ denotes addition
in $F_q$. Let $g_i$, $1 \le i \le k$, be the rows of a generator matrix
$G$ for the linear code $C$. Each vector $g_i$ generates a
one-dimensional subcode of $C$, which we denote by $C_i$. Therefore
$C = C_1 + C_2 + \cdots + C_k$, and the trellis representing $C$ is
given by $T = T_1 \times T_2 \times \cdots \times T_k$, where $T_i$ is
the trellis for $g_i$, $1 \le i \le k$.

Given a codeword $c = \langle c_1, c_2, \ldots, c_n \rangle \in C$, the
linear span [5] of $c$ is the interval $[i, j] \subseteq \{1, 2, \ldots, n\}$
which contains all the nonzero positions of $c$. A circular span [4]
has exactly the same definition with $i > j$. Note that for a given
vector the linear span is unique, but circular spans are not. For a
vector $x = \langle x_1, \ldots, x_n \rangle$ over the field $F_q$ with a
given span $[i, j]$, there is a unique elementary trellis [5], [4]
representing $x$. This trellis has $q$ vertices at time indices $i$ to
$(j - 1) \bmod n$, and a single vertex at other positions.
Consequently, $T_i$ in the trellis product mentioned earlier is the
elementary trellis representing $g_i$ for some choice of span (either
linear or circular).

Koetter and Vardy [4] have shown that any linear trellis,
conventional or tail-biting, can be constructed from a generator
matrix whose rows can be partitioned into two sets: those which have
linear span, and those taken to have circular span. The trellis for
the code is formed as a product of the elementary trellises
corresponding to these rows. We will represent such a generator
matrix as

$$G = \begin{bmatrix} G_l \\ G_c \end{bmatrix},$$

where $G_l$ is the submatrix consisting of rows with linear span, and
$G_c$ the submatrix of rows with circular span. Let $T_l$ denote the
minimum conventional trellis for the code generated by $G_l$. If $l$
is the number of rows of $G$ with linear span and $c$ the number of
rows with circular span, the tail-biting trellis constructed using
the product construction will have $q^c$ start states, where each such
start state defines a subtrellis whose codewords form a coset of the
subcode corresponding to the subtrellis containing the all-zero
codeword.
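The span definitions above can be illustrated with a small sketch. The function names are hypothetical, positions are 1-based as in the text, and the sketch assumes (as is common, though not stated explicitly here) that both endpoints of a span are nonzero positions of the vector.

```python
def nonzero_positions(x):
    """1-based positions of the nonzero symbols of x."""
    return [p for p, sym in enumerate(x, start=1) if sym != 0]


def linear_span(x):
    """Interval [i, j], i <= j, from the first to the last nonzero position."""
    nz = nonzero_positions(x)
    return (nz[0], nz[-1]) if nz else None


def circular_spans(x):
    """All [i, j] with i > j whose wrap-around interval i..n, 1..j
    contains every nonzero position (endpoints assumed nonzero)."""
    n, nz = len(x), nonzero_positions(x)
    spans = []
    for i in nz:
        for j in nz:
            if i > j:
                covered = set(range(i, n + 1)) | set(range(1, j + 1))
                if set(nz) <= covered:
                    spans.append((i, j))
    return spans


x = (1, 0, 1, 0, 1)
print(linear_span(x))      # (1, 5)
print(circular_spans(x))   # [(3, 1), (5, 3)]
```

The example vector has one linear span but two circular spans, which is the non-uniqueness noted in the text.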

For the description of the decoding algorithm we assume a
tail-biting trellis with start states $s_0, s_1, \ldots, s_t$ and final
states $f_0, f_1, \ldots, f_t$, where $t$ is the number of subtrellises.
An $(s_i, f_i)$ path is a codeword path in trellis $T_i$, whereas an
$(s_i, f_j)$ path for $i \ne j$ is a non-codeword path. For purposes of
our discussion we term the edge label sequence along such
a path a semi-codeword, as in [11]. We assume an AWGN
channel with binary antipodal signalling. When the edges
are given weights corresponding to the log-likelihood values,
ML decoding corresponds to finding the minimum weight
codeword path in the TBT.
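One common way to obtain such edge weights is sketched below. It assumes the mapping bit 0 to +1 and bit 1 to -1 and uses squared Euclidean distance, which on an AWGN channel is the negative log-likelihood up to a positive scale and an additive constant. The mapping and the function names are assumptions for illustration; the text does not fix them.

```python
def bpsk(bit):
    """Antipodal signal for a bit (assumed mapping: 0 -> +1.0, 1 -> -1.0)."""
    return 1.0 - 2.0 * bit


def edge_weight(r, bit):
    """Squared distance between the received sample r and the signal for bit."""
    return (r - bpsk(bit)) ** 2


def path_weight(received, labels):
    """Total weight of a path whose edge labels are the given bits."""
    return sum(edge_weight(r, b) for r, b in zip(received, labels))


received = [0.9, -1.1, 0.2]
# The label sequence closer to the received vector gets the smaller weight.
print(path_weight(received, [0, 1, 0]) < path_weight(received, [1, 1, 0]))
```

With weights of this kind, the minimum weight path is the most likely label sequence, so a shortest-path search over codeword paths performs ML decoding.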

III. The Two Phase Algorithm 

The algorithm operates in two phases, each taking linear 
time. The first phase is a Viterbi pass which computes a 
function Cost() for each vertex u in the trellis. This value
of cost is used by the second phase to compute a metric at 
each vertex of the trellis. The final decoding decision will be 
based on the metric values at the final nodes of the trellis. 

Let $l(u, v)$ denote the length of the shortest path connecting
vertices $u$ and $v$ in the tail-biting trellis. Note that $l()$ satisfies
the triangle inequality, i.e., $l(u, v) \le l(u, w) + l(w, v)$ for all
nodes $u, v, w$ in the trellis. A codeword is an $s_i$-$f_i$ path while
a semi-codeword is an $s_j$-$f_i$ path, $i, j \in \{1, \ldots, t\}$, where
$t$ is the number of subtrellises. Note that all codewords are
semi-codewords. Define $\delta(u) = \min_{1 \le i \le t} l(s_i, u)$. We say
an edge $(u, v) \in Section(i)$ if $v \in TimeIndex(i)$. Define the metric
at node $u$ for trellis $i$ as $m_i(u) = l(s_i, u) + \delta(f_i) - \delta(u)$.
Define the metric at node $u$ as $m(u) = \min_{1 \le i \le t} m_i(u)$.
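As a quick check on these definitions (a worked consequence rather than a statement taken from the text), evaluate $m_i$ at the two ends of subtrellis $T_i$, using $l(s_i, s_i) = 0$ and $\delta(s_i) = 0$:

```latex
\begin{align*}
m_i(s_i) &= l(s_i, s_i) + \delta(f_i) - \delta(s_i) = \delta(f_i), \\
m_i(f_i) &= l(s_i, f_i) + \delta(f_i) - \delta(f_i) = l(s_i, f_i).
\end{align*}
```

So the metric for trellis $i$ starts at the estimate $\delta(f_i)$ and, at the final node, equals the length of the shortest codeword path in $T_i$.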

Suppose $\delta(u) = x$ and this is the length of an $s_i$-$u$ path.
The first phase of the algorithm assigns the program variable
Cost[u] the value $x$ and SurvTrellis[u] the value $i$. We
call the $s_i$-$u$ path corresponding to this assignment the
survivor at $u$.

These values are used to assign values to the program
variable Metric[u] in the second phase, which is intended to
store the value of the metric $m(u)$. The trellis corresponding to
the minimum metric value is stored in the variable Trellis[u].
However, the values assigned to Metric[] can be incorrect, in
that they need not equal $m()$. The algorithm may even fail to
assign a value to Metric[u] for some nodes u. We shall derive
the conditions under which the algorithm may fail to decode
correctly.

The program variable Dist[] stores the length of the path to
the node corresponding to the minimum value of Metric[] in
the second phase. The program variable Pred[], used in both
phases, stores the predecessor along the paths traced to the
node by the algorithm in the respective phases.

The function Member((u, v), i) assumed in the algorithm
description below takes as input an edge (u, v) and an integer
i and returns TRUE if the edge (u, v) belongs to trellis $T_i$,
FALSE otherwise. Note that the function Member() needs
only O(1) lookup time, although the lookup table is of size
quadratic in the number of vertices in the trellis.



A. Phase 1: Estimation 
Initialization: 

  for each s_i ∈ TimeIndex(0)
      Cost[s_i] = 0
      SurvTrellis[s_i] = i
      Pred[s_i] = s_i
  for each v ∉ TimeIndex(0)
      Cost[v] = ∞

Estimation:

  for i := 1 to n do
      for each edge (u, v) ∈ Section(i) do
          Temp = Cost[u] + l(u, v)
          if (Cost[v] > Temp) then
              Cost[v] = Temp
              Pred[v] = u
              SurvTrellis[v] = SurvTrellis[u]

Clearly, by the end of this phase, Cost[u] = $\delta(u)$ for each
vertex u in the trellis.

Let $j = \arg\min_{1 \le i \le t} \delta(f_i)$. If the algorithm assigns
SurvTrellis[$f_j$] = $j$, then the survivor at $f_j$, which corresponds
to the minimum weight semi-codeword in the trellis, turns out
to be a codeword and the algorithm stops. Otherwise, the
second phase described below will be executed.
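The estimation phase and its stopping test can be sketched in Python on a toy trellis. Everything about the data layout is an assumption made for illustration: nodes are `(time, state)` tuples, each edge is a tuple `(u, v, weight, member_subtrellises)`, subtrellises are numbered from 0, and the weights are made up.

```python
import math

# Hypothetical 2-state, 3-section tail-biting trellis with made-up weights.
SECTIONS = [
    [((0, 0), (1, 0), 1, {0}), ((0, 0), (1, 1), 4, {0}),
     ((0, 1), (1, 0), 4, {1}), ((0, 1), (1, 1), 1, {1})],
    [((1, 0), (2, 0), 1, {0, 1}), ((1, 0), (2, 1), 2, {0, 1}),
     ((1, 1), (2, 0), 5, {0, 1}), ((1, 1), (2, 1), 1, {0, 1})],
    [((2, 0), (3, 0), 5, {0}), ((2, 0), (3, 1), 1, {1}),
     ((2, 1), (3, 0), 3, {0}), ((2, 1), (3, 1), 2, {1})],
]
STARTS = [(0, 0), (0, 1)]   # s_0, s_1
FINALS = [(3, 0), (3, 1)]   # f_0, f_1


def phase1(sections, starts):
    """Viterbi-like pass: cost[u] = delta(u), with survivor bookkeeping."""
    cost, surv, pred = {}, {}, {}
    for i, s in enumerate(starts):
        cost[s], surv[s], pred[s] = 0, i, s
    for edges in sections:                      # sections in time order
        for u, v, w, _members in edges:
            temp = cost.get(u, math.inf) + w
            if cost.get(v, math.inf) > temp:
                cost[v], pred[v], surv[v] = temp, u, surv[u]
    return cost, surv, pred


cost, surv, pred = phase1(SECTIONS, STARTS)
# Stopping test: is the best semi-codeword survivor actually a codeword?
j = min(range(len(FINALS)), key=lambda i: cost[FINALS[i]])
stops_in_phase1 = surv[FINALS[j]] == j
print(cost[FINALS[0]], surv[FINALS[0]], cost[FINALS[1]], surv[FINALS[1]],
      stops_in_phase1)   # 5 1 3 0 False
```

In this toy instance both final nodes are reached most cheaply from the "wrong" start node, so the best survivor is a semi-codeword and the second phase is needed.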

B. Phase 2: Revision 
Initialization: 

  for each s_i ∈ TimeIndex(0)
      if (SurvTrellis[f_i] ≠ i) then Metric[s_i] = δ(f_i)
      else Metric[s_i] = ∞    /* No processing for T_i */
      Pred[s_i] = s_i
      Trellis[s_i] = i
      if (Metric[s_i] = ∞) then Dist[s_i] = ∞
      else Dist[s_i] = 0
  for each v ∉ TimeIndex(0)
      Metric[v] = ∞

Revision:

  for i := 1 to n do
      for each edge (u, v) ∈ Section(i) do
          Update(u, v)

Update(u, v):

  if (not Member((u, v), Trellis[u])) return
  temp = Dist[u] + l(u, v) + Cost[f_{Trellis[u]}] - Cost[v]
  if (Metric[v] > temp) then
      Metric[v] = temp
      Pred[v] = u
      Trellis[v] = Trellis[u]
      Dist[v] = Dist[u] + l(u, v)

The second phase attempts to compute the value of the
metric $m(u)$ for each vertex u of the trellis. If the first phase
assigned SurvTrellis[$f_i$] = $i$ for some final node $f_i$, then for
the particular trellis $T_i$ the second phase processing is not
required. We say a trellis $T_i$ participates in the second phase if
SurvTrellis[$f_i$] $\ne i$ and
$\delta(f_i) < \min_{j :\, SurvTrellis[f_j] = j} \delta(f_j)$.
The final decoding decision is based on the values of the metric
at the final nodes of the trellis. We shall derive the conditions
under which the algorithm will achieve maximum likelihood
decoding on a tail-biting trellis for a linear code, when binary
antipodal signaling is used over an AWGN channel.

C. Final Decision 

If the algorithm does not stop in the first phase, choose vertex
$f_j$ with $j = \arg\min_{1 \le i \le t}$ Metric[$f_i$]. The output of the
algorithm is the codeword corresponding to the $s_j$-$f_j$ path
obtained by tracing the predecessors of $f_j$ back to $s_j$. The array
Pred[] stores the predecessor of each node along the path that
minimizes the value of the metric. Note that if $T_j$ does not
participate in the second phase, the path must be traced along the
Pred[] values of the first phase.
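The two phases and the final decision can be put together in a short Python sketch. As before, the trellis layout (nodes as `(time, state)` tuples, edges as `(u, v, weight, member_subtrellises)`, 0-based subtrellis indices) and the weights are assumptions for illustration. For a trellis whose phase-one survivor at $f_i$ is already a codeword, the sketch takes that survivor as the candidate, which is one reading of the note above; the extra participation bound is omitted for brevity.

```python
import math

SECTIONS = [  # hypothetical toy trellis with made-up weights
    [((0, 0), (1, 0), 1, {0}), ((0, 0), (1, 1), 4, {0}),
     ((0, 1), (1, 0), 4, {1}), ((0, 1), (1, 1), 1, {1})],
    [((1, 0), (2, 0), 1, {0, 1}), ((1, 0), (2, 1), 2, {0, 1}),
     ((1, 1), (2, 0), 5, {0, 1}), ((1, 1), (2, 1), 1, {0, 1})],
    [((2, 0), (3, 0), 5, {0}), ((2, 0), (3, 1), 1, {1}),
     ((2, 1), (3, 0), 3, {0}), ((2, 1), (3, 1), 2, {1})],
]
STARTS, FINALS = [(0, 0), (0, 1)], [(3, 0), (3, 1)]


def phase1(sections, starts):
    cost, surv, pred = {}, {}, {}
    for i, s in enumerate(starts):
        cost[s], surv[s], pred[s] = 0, i, s
    for edges in sections:
        for u, v, w, _members in edges:
            temp = cost.get(u, math.inf) + w
            if cost.get(v, math.inf) > temp:
                cost[v], pred[v], surv[v] = temp, u, surv[u]
    return cost, surv, pred


def phase2(sections, starts, finals, cost, surv):
    metric, dist, trellis, pred = {}, {}, {}, {}
    for i, s in enumerate(starts):
        revise = surv[finals[i]] != i            # T_i needs revision
        metric[s] = cost[finals[i]] if revise else math.inf
        dist[s] = 0 if revise else math.inf
        trellis[s], pred[s] = i, s
    for edges in sections:
        for u, v, w, members in edges:
            if u not in trellis or trellis[u] not in members:
                continue                         # Member((u, v), Trellis[u]) fails
            temp = dist[u] + w + cost[finals[trellis[u]]] - cost[v]
            if metric.get(v, math.inf) > temp:
                metric[v], pred[v] = temp, u
                trellis[v], dist[v] = trellis[u], dist[u] + w
    return metric, trellis, pred


def trace(pred, node):
    """Follow predecessors back to a start node (where pred[s] == s)."""
    path = [node]
    while pred[node] != node:
        node = pred[node]
        path.append(node)
    return path[::-1]


def decode(sections, starts, finals):
    cost, surv, pred1 = phase1(sections, starts)
    j = min(range(len(finals)), key=lambda i: cost[finals[i]])
    if surv[finals[j]] == j:                     # stop after phase 1
        return cost[finals[j]], trace(pred1, finals[j])
    metric, trellis, pred2 = phase2(sections, starts, finals, cost, surv)
    best, best_path = math.inf, None
    for i, f in enumerate(finals):
        if surv[f] == i:                         # phase-1 survivor is a codeword
            value, path = cost[f], trace(pred1, f)
        elif trellis.get(f) == i:
            value, path = metric[f], trace(pred2, f)
        else:
            continue                             # no metric was assigned to f_i
        if value < best:
            best, best_path = value, path
    return best, best_path


weight, path = decode(SECTIONS, STARTS, FINALS)
print(weight, path)   # 4 [(0, 1), (1, 1), (2, 1), (3, 1)]
```

On this instance the decoder returns the weight-4 codeword path in subtrellis 1, which is also the minimum over all codeword paths. The value recorded at $f_0$ is 7 although the shortest $s_0$-$f_0$ path has weight 6, illustrating that Metric[] can overestimate $m()$ for trellises that lose nodes to a competitor.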

IV. Analysis 

For any node u, if Trellis[u] = $j$, then Dist[u] $\ge l(s_j, u)$,
because the value assigned to Dist[u] is the length of an
$s_j$-$u$ path. Consequently Metric[u] $\ge m_j(u)$. We collect these
facts into a lemma:

Lemma 1: During the second phase, if the algorithm assigns
Trellis[u] = $j$ for a node u, then Dist[u] $\ge l(s_j, u)$
and Metric[u] $\ge m_j(u)$.

The following simple property of $\delta()$ will be useful:

Lemma 2: If (u, v) is an edge in the TBT, then
$\delta(v) \le \delta(u) + l(u, v)$.

Proof: The shortest path from a start node to v cannot
be longer than the shortest path from a start node to v through
u. ■

The following lemma asserts that the value assigned to
Metric by the algorithm cannot be smaller than the Metric
value of its predecessor node.

Lemma 3: Let (u, v) be an edge in the tail-biting trellis
and let Trellis[u] = $i$. If the second phase assigns
Pred[v] = u, then Metric[v] $\ge$ Metric[u].

Proof: An inspection of the algorithm reveals that the
algorithm assigns to Dist[u] the cost of some $s_i$-$u$ path.
Hence Dist[u] $\ge l(s_i, u)$. By the Metric update rule of the
algorithm, Metric[v] = Dist[u] $+\, l(u, v) + \delta(f_i) - \delta(v)$. Since
Metric[u] = Dist[u] $+\, \delta(f_i) - \delta(u)$, the result follows, as
$\delta(v) \le \delta(u) + l(u, v)$ by Lemma 2. ■
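Written out, the step in this proof is a subtraction of the two update expressions, using the rule temp = Dist[u] + l(u, v) + Cost[$f_i$] - Cost[v] with Cost[] = $\delta()$:

```latex
\begin{align*}
\mathrm{Metric}[v] - \mathrm{Metric}[u]
  &= \bigl(\mathrm{Dist}[u] + l(u,v) + \delta(f_i) - \delta(v)\bigr)
   - \bigl(\mathrm{Dist}[u] + \delta(f_i) - \delta(u)\bigr) \\
  &= \delta(u) + l(u,v) - \delta(v) \;\ge\; 0 ,
\end{align*}
```

where the final inequality is exactly Lemma 2.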

Corollary 1: If the algorithm assigns Trellis[u] = $i$, then
Metric[u] $\ge$ Metric[$s_i$] = $\delta(f_i)$.

Proof: The algorithm initializes Metric[$s_i$] to $\delta(f_i)$. By the
previous lemma, the value cannot decrease along any $s_i$-$u$
path. ■

The algorithm, if it assigns any value at $f_j$, must set
Trellis[$f_j$] = $j$ and Metric[$f_j$] = Dist[$f_j$] $\ge l(s_j, f_j) = m_j(f_j)$
for each $j \in \{1, \ldots, t\}$. Thus, if the shortest path
corresponding to a codeword in the trellis is an $s_j$-$f_j$ path and
Metric[$f_j$] = $l(s_j, f_j)$, the algorithm is guaranteed to decode
correctly. In the following, we derive a condition necessary for
the algorithm to fail.

Theorem 1: If the shortest codeword corresponds to an $s_i$-$f_i$
path P, and if P corresponds to the codeword output by a
maximum likelihood decoder, then the two phase algorithm
fails to assign Metric[u] = $m_i(u)$ and Trellis[u] = $i$ for
some node u in P only if there exist $k \ne j \ne i$ such that

$$l(s_k, f_j) \le l(s_i, f_i).$$

Proof: Without loss of generality, assume that the all zero
codeword was transmitted and that an ideal ML decoder will output
the all zero codeword. Again without loss of generality, let
$P = (s_1 =)\, u_0, u_1, \ldots, u_{n-1}, u_n\, (= f_1)$ be the shortest
$s_1$-$f_1$ path in the subtrellis $T_1$ corresponding to the all zero
codeword. We therefore have $l(s_1, f_1) < l(s_i, f_i)$ for all
$1 < i \le t$. Let u be the first node along the path P where there
exists some $1 < j \le t$ such that $m_j(u) < m_1(u)$. Such a node u
must exist, for otherwise the algorithm will decode correctly, as it
will assign Trellis[$u_i$] = 1 with Metric[$u_i$] = $m_1(u_i)$ all along
the path P.

Note that $m_1(f_1) = l(s_1, f_1)$ is the value the algorithm
would have assigned to Metric[$f_1$] if the algorithm had
assigned Trellis[$u_i$] = 1 all along the path P. As the
algorithm assigns the minimum value of Metric possible
for each node, by Lemma 3 the actual value assigned to
Metric[u] by the algorithm must satisfy
Metric[u] $\le l(s_1, f_1)$. Since we assume that the algorithm
assigned Trellis[u] = $j$, the value of the metric computed
at u must have followed an $s_j$-$u$ path, and consequently
Metric[u] $\ge$ Metric[$s_j$] = $\delta(f_j)$ (Corollary 1). Hence
$\delta(f_j) \le l(s_1, f_1)$.

Now assume that the survivor at $f_j$ is an $s_k$-$f_j$ path. If $k = j$,
we have $l(s_j, f_j) \le l(s_1, f_1)$, a contradiction. Otherwise,
the condition stated in the theorem holds. ■

We now specialize the above to the AWGN channel with binary
antipodal signaling. The following two results, proved in [11],
are repeated here for completeness.

Lemma 4: The space of semi-codewords is a vector space.

Proof: Assume that each of the $c$ vectors in the submatrix
$G_c$ of the generator matrix is of the form $v_i = [h_i, 0, t_i]$,
where $h_i$ stands for the sequence of symbols before the zero
run and is called the head, $t_i$ stands for the sequence of
symbols following the zero run and is called the tail, and $0$
is the zero run containing the appropriate number of zeroes.
Let $\{v_1, v_2, \ldots, v_c\}$ be the vectors of $G_c$. Then the matrix
$G_s$ defined as

$$G_s = \begin{bmatrix} G_l \\ G_c' \end{bmatrix},$$

where $G_c'$ consists of the $2c$ rows of the form $[h_i, 0]$, $[0, t_i]$,
$1 \le i \le c$, generates the set of labels of all paths from any
start node to any final node. ■

The following is due to Tendolkar and Hartmann [8]. 

Lemma 5: Let $H$ be the parity check matrix of the code and
let a codeword $x$ be transmitted as a signal vector $s(x)$. Let
the binary quantization of the received vector
$r = (r_1, r_2, \ldots, r_n)$ be denoted by $y$. Let
$r' = (|r_1|, |r_2|, \ldots, |r_n|)$ and $S = yH^T$.
Then maximum likelihood decoding is achieved by decoding
a received vector $r$ into the codeword $y + e$, where $e$ is a binary
vector that satisfies $S = eH^T$ and has the property that if $e'$ is
any other binary vector such that $S = e'H^T$, then
$e \cdot r' \le e' \cdot r'$, where $\cdot$ is the inner product.
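The rule in Lemma 5 can be tried out by brute force on a very small code. The sketch below is only an illustration: it enumerates all $2^n$ binary vectors, so it is exponential in $n$, and it assumes the signal mapping bit 0 to +1 and bit 1 to -1. The function names and the (3, 1) repetition code example are not from the text.

```python
from itertools import product


def syndrome(vec, H):
    """vec * H^T over GF(2), with H given as a list of rows."""
    return tuple(sum(h * v for h, v in zip(row, vec)) % 2 for row in H)


def reliability_decode(r, H):
    """Pick the error pattern e with the syndrome of y that minimizes e . r'."""
    y = tuple(0 if x >= 0 else 1 for x in r)      # binary quantization
    r_abs = [abs(x) for x in r]                   # r' = (|r_1|, ..., |r_n|)
    S = syndrome(y, H)
    candidates = (e for e in product((0, 1), repeat=len(r))
                  if syndrome(e, H) == S)
    e = min(candidates,
            key=lambda e: sum(ei * w for ei, w in zip(e, r_abs)))
    return tuple((yi + ei) % 2 for yi, ei in zip(y, e))


# Parity check matrix of the (3, 1) repetition code.
H = [[1, 1, 0],
     [0, 1, 1]]
print(reliability_decode((0.8, -0.3, 0.5), H))   # (0, 0, 0)
```

Here the hard decision is (0, 1, 0); the candidate error patterns are (0, 1, 0) with cost 0.3 and (1, 0, 1) with cost 1.3, so the unreliable middle position is flipped and the all zero codeword is output.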

Combining all the above, we have the following necessary
condition for error.

Theorem 2: Assume the all zero codeword is the ML codeword,
corresponding to the path $s_1$-$f_1$ in the tail-biting trellis. Let $y$
be the binary quantization of the received vector. Let $r$, $r'$ be
as defined in Lemma 5. For the error pattern $e$, the two phase
algorithm decodes to a nonzero vector corresponding to
an $s_j$-$f_j$ path, $j \ne 1$, only if there exists a semi-codeword $C_s$
satisfying

$$(C_s + e) \cdot r' \le e \cdot r' \le (C + e) \cdot r'$$

for all nonzero codewords $C$, where the semi-codeword $C_s$
shares either its head or its tail with trellis $T_j$.

Proof: Since the ideal ML decoder decodes $y$ to $0$, we
have $y + e = 0$, or $e = y$. Let $H$ be the parity check matrix
of the code and $H_s$ the parity check matrix for the
semi-codeword vector space established in Lemma 4. Any binary
error vector $e'$ which gives the same syndrome as $e$ must
belong to the same coset of the code and hence must have
the form $C + e$, where $C$ is a codeword. Applying Lemma 5,
we get $e \cdot r' \le (C + e) \cdot r'$ for all codewords $C$, which proves
the right inequality.

To obtain the left inequality, first observe that the first phase
of the algorithm does an ML decoding on the semi-codeword
space. Any $s_k$-$f_j$ path P in the tail-biting trellis with $k \ne j$
and $l(s_k, f_j) \le l(s_1, f_1)$ corresponds to a semi-codeword
that an ideal ML decoder operating on the space of
semi-codewords will prefer to the all zero codeword. Hence, by
applying Lemma 5 to this case and arguing identically as
above, we find that for each such path P there must exist
a semi-codeword $C_s$ such that $(C_s + e) \cdot r' \le e \cdot r'$. The claim
follows, as Theorem 1 asserts that this condition is necessary
for the algorithm to fail to decode the received vector to the
all zero codeword. ■

V. Complexity 

Since each phase takes linear time, the algorithm runs in
time linear in the size of the tail-biting trellis. As each pass is
Viterbi-like, the worst case number of comparisons performed
is bounded by twice that of the Viterbi algorithm. The space
complexity is quadratic in the size of the trellis owing to the
lookup table of size $t|V|$ required for the Member() function,
where $|V|$ is the number of vertices in the trellis. However,
this is not a serious drawback, as the table can be efficiently
implemented using a bit vector representation.
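One plausible bit vector realization is sketched below; it is an assumption, not the layout used by the authors. Each vertex stores a $t$-bit mask with bit $i$ set when the vertex lies on some $s_i$-$f_i$ path. An existing edge (u, v) then lies on an $s_i$-$f_i$ path exactly when both endpoints do, so Member() reduces to one AND and one shift.

```python
def build_vertex_masks(vertex_members):
    """vertex_members: dict vertex -> iterable of subtrellis indices (0-based).
    Returns dict vertex -> integer bit mask, one bit per subtrellis."""
    return {v: sum(1 << i for i in set(members))
            for v, members in vertex_members.items()}


def member(masks, u, v, i):
    """True if the (existing) edge (u, v) belongs to subtrellis i."""
    return ((masks[u] & masks[v]) >> i) & 1 == 1


# Hypothetical vertices of a trellis with two subtrellises.
masks = build_vertex_masks({
    (0, 0): [0],        # start state s_0
    (1, 0): [0, 1],     # interior state shared by both subtrellises
    (3, 1): [1],        # final state f_1
})
print(member(masks, (0, 0), (1, 0), 0), member(masks, (0, 0), (1, 0), 1))
# True False
```

The table then holds $t$ bits for each of the $|V|$ vertices, matching the $t|V|$ size quoted above.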

VI. Simulation Results 

The results of simulations on an AWGN channel for the
two phase algorithm are displayed in the figures below. The
codes used are a rate 1/2 memory 6 convolutional code with
a circle size of 48 (same as the (554,744) convolutional
code used in [2]) and a rate 1/2 memory 4 convolutional code
with circle size 20 (same as the (72,62) code used in [1]). The
performance of the above codes is compared with that of the
exact ML decoding algorithm in [11]. It is seen that the bit
error rate of the algorithm approaches that of the ideal ML
decoder.

VII. Discussion and conclusion 

The performance of the algorithm can be improved at the
expense of more storage by tracking more than one path
corresponding to the lowest values of Metric during the
second phase. However, the time complexity increases in
proportion to the number of stored paths. Practice has shown
that memorizing the best two paths corresponding to the
minimum value of Metric at each node gives performance
almost indistinguishable from the ideal maximum likelihood
decoder [11].

An interesting failure condition of the algorithm is the
following: the algorithm may fail to assign a value to the
Metric field for a node if, in the second phase, the node fails
to belong to any of the trellises assigned to the Trellis field
of its predecessors by the algorithm. If this happens along all
paths to all final states, the algorithm may fail to output a
codeword in the second phase. Note that the error condition
proved handles this case as well. However, this situation never
occurred in the simulations performed.

From the results of simulations on the rate 1/2, memory
4 convolutional code with a circle size of 20 and a rate 1/2,
memory 6 convolutional code with a circle size of 48, it is seen
that the algorithm performs close to the ideal ML decoder. The
performance is comparable with other linear time approximate
methods. The present algorithm reduces computation to just
two Viterbi computations on the tail-biting trellis and does not
require dynamic data structures like the heap necessary in the
original versions using the A* algorithm [11].